Advanced Deep Learning with TensorFlow 2 and Keras (in progress)
import tensorflow as tf
print(tf.__version__)
2.11.0
from tensorflow.keras import backend as K
print(K.epsilon())
1e-07
ref
- Advanced Deep Learning with TensorFlow 2 and Keras
- Advanced Deep Learning with TensorFlow 2 and Keras - applying DL, GAN, VAE, deep RL, unsupervised learning, object detection and segmentation, and more - GitHub
Chapter 1: Introducing Advanced Deep Learning with Keras
MLP
- Multilayer Perceptron (MLP)
: a fully connected network; also called a deep feed-forward network or feed-forward neural network (a minimal Keras sketch follows below)
CNN
RNN
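To make "fully connected" concrete, here is a minimal Keras MLP sketch (my own addition, not from the book; the layer sizes are arbitrary):
# a minimal MLP: a stack of fully connected (Dense) layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

mlp = Sequential([
    Dense(256, activation='relu', input_shape=(784,)),  # hidden layer 1
    Dense(256, activation='relu'),                      # hidden layer 2
    Dense(10, activation='softmax'),                    # 10-class output
])
mlp.compile(loss='categorical_crossentropy', optimizer='adam')
mlp.summary()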
Chapter 3: Autoencoders
- Encoder: transforms the input \(x\) into a low-dimensional latent vector \(z = f(x)\)
  - features to learn from MNIST digits: handwriting style, tilt angle, roundness of strokes, stroke thickness, etc.
- Decoder: recovers the input from the latent vector, \(g(z) = \tilde{x}\)
  - the goal is to make \(\tilde{x}\) as close to \(x\) as possible
- The encoder and decoder are both nonlinear functions
- An autoencoder can be implemented as an MLP or a CNN
- Trained by minimizing a loss function via backpropagation
- Autoencoder loss functions:
\[ L = -\log p(x \mid z) \]
\[ L(x, \tilde{x}) = \mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} (x_i - \tilde{x}_i)^2 \]
- m: dimension of the output (for MNIST, m = width × height × channels = 28 × 28 × 1 = 784); a quick numpy check of the MSE follows
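A quick numpy check of the MSE formula (my own addition; the arrays below are random stand-ins for \(x\) and \(\tilde{x}\)):
# (1/m) * sum_i (x_i - x~_i)^2 over an m = 784 dimensional output
import numpy as np

x = np.random.rand(784)        # flattened 28x28x1 "input"
x_tilde = np.random.rand(784)  # a stand-in "reconstruction"
mse = np.mean((x - x_tilde) ** 2)
print(mse)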
Building the autoencoder
'''Example of autoencoder model on MNIST dataset
This autoencoder has modular design. The encoder, decoder and autoencoder
are 3 models that share weights. For example, after training the
autoencoder, the encoder can be used to generate latent vectors
of input data for low-dim visualization like PCA or TSNE.
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.layers import Conv2D, Flatten
from tensorflow.keras.layers import Reshape, Conv2DTranspose
from tensorflow.keras.models import Model
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import plot_model
from tensorflow.keras import backend as K
import numpy as np
import matplotlib.pyplot as plt
# load MNIST dataset
(x_train, _), (x_test, _) = mnist.load_data()

# reshape to (28, 28, 1) and normalize input images
image_size = x_train.shape[1]
x_train = np.reshape(x_train, [-1, image_size, image_size, 1])
x_test = np.reshape(x_test, [-1, image_size, image_size, 1])
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# network parameters
input_shape = (image_size, image_size, 1)
batch_size = 32
kernel_size = 3
latent_dim = 16
# encoder/decoder number of CNN layers and filters per layer
layer_filters = [32, 64]

# build the autoencoder model
# first build the encoder model
inputs = Input(shape=input_shape, name='encoder_input')
x = inputs
# stack of Conv2D(32)-Conv2D(64)
for filters in layer_filters:
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               activation='relu',
               strides=2,
               padding='same')(x)

# shape info needed to build decoder model
# so we don't do hand computation
# the input to the decoder's first
# Conv2DTranspose will have this shape
# shape is (7, 7, 64) which is processed by
# the decoder back to (28, 28, 1)
shape = K.int_shape(x)

# generate latent vector
x = Flatten()(x)
latent = Dense(latent_dim, name='latent_vector')(x)

# instantiate encoder model
encoder = Model(inputs,
                latent,
                name='encoder')
encoder.summary()
plot_model(encoder,
           to_file='encoder.png',
           show_shapes=True)

# build the decoder model
latent_inputs = Input(shape=(latent_dim,), name='decoder_input')
# use the shape (7, 7, 64) that was earlier saved
x = Dense(shape[1] * shape[2] * shape[3])(latent_inputs)
# from vector to suitable shape for transposed conv
x = Reshape((shape[1], shape[2], shape[3]))(x)

# stack of Conv2DTranspose(64)-Conv2DTranspose(32)
for filters in layer_filters[::-1]:
    x = Conv2DTranspose(filters=filters,
                        kernel_size=kernel_size,
                        activation='relu',
                        strides=2,
                        padding='same')(x)

# reconstruct the input
outputs = Conv2DTranspose(filters=1,
                          kernel_size=kernel_size,
                          activation='sigmoid',
                          padding='same',
                          name='decoder_output')(x)

# instantiate decoder model
decoder = Model(latent_inputs, outputs, name='decoder')
decoder.summary()
plot_model(decoder, to_file='decoder.png', show_shapes=True)

# autoencoder = encoder + decoder
# instantiate autoencoder model
autoencoder = Model(inputs,
                    decoder(encoder(inputs)),
                    name='autoencoder')
autoencoder.summary()
plot_model(autoencoder,
           to_file='autoencoder.png',
           show_shapes=True)

# Mean Square Error (MSE) loss function, Adam optimizer
autoencoder.compile(loss='mse', optimizer='adam')

# train the autoencoder
autoencoder.fit(x_train,
                x_train,
                validation_data=(x_test, x_test),
                epochs=1,
                batch_size=batch_size)

# predict the autoencoder output from test data
x_decoded = autoencoder.predict(x_test)

# display the 1st 8 test input and decoded images
imgs = np.concatenate([x_test[:8], x_decoded[:8]])
imgs = imgs.reshape((4, 4, image_size, image_size))
imgs = np.vstack([np.hstack(i) for i in imgs])
plt.figure()
plt.axis('off')
plt.title('Input: 1st 2 rows, Decoded: last 2 rows')
plt.imshow(imgs, interpolation='none', cmap='gray')
plt.savefig('input_and_decoded.png')
plt.show()
Model: "encoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
encoder_input (InputLayer) [(None, 28, 28, 1)] 0
conv2d (Conv2D) (None, 14, 14, 32) 320
conv2d_1 (Conv2D) (None, 7, 7, 64) 18496
flatten (Flatten) (None, 3136) 0
latent_vector (Dense) (None, 16) 50192
=================================================================
Total params: 69,008
Trainable params: 69,008
Non-trainable params: 0
_________________________________________________________________
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
Model: "decoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
decoder_input (InputLayer) [(None, 16)] 0
dense (Dense) (None, 3136) 53312
reshape (Reshape) (None, 7, 7, 64) 0
conv2d_transpose (Conv2DTra (None, 14, 14, 64) 36928
nspose)
conv2d_transpose_1 (Conv2DT (None, 28, 28, 32) 18464
ranspose)
decoder_output (Conv2DTrans (None, 28, 28, 1) 289
pose)
=================================================================
Total params: 108,993
Trainable params: 108,993
Non-trainable params: 0
_________________________________________________________________
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
Model: "autoencoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
encoder_input (InputLayer) [(None, 28, 28, 1)] 0
encoder (Functional) (None, 16) 69008
decoder (Functional) (None, 28, 28, 1) 108993
=================================================================
Total params: 178,001
Trainable params: 178,001
Non-trainable params: 0
_________________________________________________________________
You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) for plot_model to work.
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0212 - val_loss: 0.0104
313/313 [==============================] - 1s 2ms/step
- The encoder model is composed of Conv2D(32)-Conv2D(64)-Dense(16) to generate the low-dimensional latent vector
- The decoder model takes the 16-dim latent vector and is composed of Dense-Conv2DTranspose(64)-Conv2DTranspose(32)-Conv2DTranspose(1)
- The decoder's input is the latent vector, which it decodes to recover the original input
Visualizing the latent vector
- The latent code dimensions separate the digits (a visualization sketch follows below)
  - digit 0: lower-left quadrant
  - digit 1: upper-right quadrant
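A minimal sketch of this visualization (my own addition, assuming the encoder trained above and that scikit-learn is available; the book instead retrains with a 2-dim latent vector, here the 16-dim latent vectors are simply projected with PCA):
# project test-set latent vectors to 2-D and color by digit label
from sklearn.decomposition import PCA

(_, _), (_, y_test) = mnist.load_data()  # reload only the labels
z = encoder.predict(x_test)              # (10000, 16) latent vectors
z_2d = PCA(n_components=2).fit_transform(z)

plt.figure(figsize=(8, 6))
plt.scatter(z_2d[:, 0], z_2d[:, 1], c=y_test, cmap='tab10', s=2)
plt.colorbar()
plt.xlabel('latent PC 1')
plt.ylabel('latent PC 2')
plt.show()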
Denoising autoencoder (DAE)
Denoising: the input is the original data corrupted by noise
\[ x = x_{orig} + noise \]
The encoder's goal: learn how to generate the latent vector \(z\) from which the decoder can recover \(x_{orig}\) by minimizing a dissimilarity loss such as MSE
\[ L(x_{orig}, \tilde{x}) = \mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} (x_{orig,i} - \tilde{x}_i)^2 \]
'''Trains a denoising autoencoder on MNIST dataset.
Denoising is one of the classic applications of autoencoders.
The denoising process removes unwanted noise that corrupted the
true data.
Noise + Data ---> Denoising Autoencoder ---> Data
Given a training dataset of corrupted data as input and
true data as output, a denoising autoencoder can recover the
hidden structure to generate clean data.
This example has modular design. The encoder, decoder and autoencoder
are 3 models that share weights. For example, after training the
autoencoder, the encoder can be used to generate latent vectors
of input data for low-dim visualization like PCA or TSNE.
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.layers import Conv2D, Flatten
from tensorflow.keras.layers import Reshape, Conv2DTranspose
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K
from tensorflow.keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
np.random.seed(1337)

# load MNIST dataset
(x_train, _), (x_test, _) = mnist.load_data()

# reshape to (28, 28, 1) and normalize input images
image_size = x_train.shape[1]
x_train = np.reshape(x_train, [-1, image_size, image_size, 1])
x_test = np.reshape(x_test, [-1, image_size, image_size, 1])
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# generate corrupted MNIST images by adding noise with normal dist
# centered at 0.5 and std=0.5
noise = np.random.normal(loc=0.5, scale=0.5, size=x_train.shape)
x_train_noisy = x_train + noise
noise = np.random.normal(loc=0.5, scale=0.5, size=x_test.shape)
x_test_noisy = x_test + noise

# adding noise may exceed normalized pixel values >1.0 or <0.0
# clip pixel values >1.0 to 1.0 and <0.0 to 0.0
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)

# network parameters
input_shape = (image_size, image_size, 1)
batch_size = 32
kernel_size = 3
latent_dim = 16
# encoder/decoder number of CNN layers and filters per layer
layer_filters = [32, 64]

# build the autoencoder model
# first build the encoder model
inputs = Input(shape=input_shape, name='encoder_input')
x = inputs

# stack of Conv2D(32)-Conv2D(64)
for filters in layer_filters:
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=2,
               activation='relu',
               padding='same')(x)

# shape info needed to build decoder model so we don't do hand computation
# the input to the decoder's first Conv2DTranspose will have this shape
# shape is (7, 7, 64) which can be processed by the decoder back to (28, 28, 1)
shape = K.int_shape(x)

# generate the latent vector
x = Flatten()(x)
latent = Dense(latent_dim, name='latent_vector')(x)

# instantiate encoder model
encoder = Model(inputs, latent, name='encoder')
encoder.summary()

# build the decoder model
latent_inputs = Input(shape=(latent_dim,), name='decoder_input')
# use the shape (7, 7, 64) that was earlier saved
x = Dense(shape[1] * shape[2] * shape[3])(latent_inputs)
# from vector to suitable shape for transposed conv
x = Reshape((shape[1], shape[2], shape[3]))(x)

# stack of Conv2DTranspose(64)-Conv2DTranspose(32)
for filters in layer_filters[::-1]:
    x = Conv2DTranspose(filters=filters,
                        kernel_size=kernel_size,
                        strides=2,
                        activation='relu',
                        padding='same')(x)

# reconstruct the denoised input
outputs = Conv2DTranspose(filters=1,
                          kernel_size=kernel_size,
                          padding='same',
                          activation='sigmoid',
                          name='decoder_output')(x)

# instantiate decoder model
decoder = Model(latent_inputs, outputs, name='decoder')
decoder.summary()

# autoencoder = encoder + decoder
# instantiate autoencoder model
autoencoder = Model(inputs, decoder(encoder(inputs)), name='autoencoder')
autoencoder.summary()

# Mean Square Error (MSE) loss function, Adam optimizer
autoencoder.compile(loss='mse', optimizer='adam')

# train the autoencoder
autoencoder.fit(x_train_noisy,
                x_train,
                validation_data=(x_test_noisy, x_test),
                epochs=10,
                batch_size=batch_size)

# predict the autoencoder output from corrupted test images
x_decoded = autoencoder.predict(x_test_noisy)

# 3 sets of images with 9 MNIST digits
# 1st rows - original images
# 2nd rows - images corrupted by noise
# 3rd rows - denoised images
rows, cols = 3, 9
num = rows * cols
imgs = np.concatenate([x_test[:num], x_test_noisy[:num], x_decoded[:num]])
imgs = imgs.reshape((rows * 3, cols, image_size, image_size))
imgs = np.vstack(np.split(imgs, rows, axis=1))
imgs = imgs.reshape((rows * 3, -1, image_size, image_size))
imgs = np.vstack([np.hstack(i) for i in imgs])
imgs = (imgs * 255).astype(np.uint8)
plt.figure()
plt.axis('off')
plt.title('Original images: top rows, '
          'Corrupted Input: middle rows, '
          'Denoised Input: third rows')
plt.imshow(imgs, interpolation='none', cmap='gray')
Image.fromarray(imgs).save('corrupted_and_denoised.png')
plt.show()
Model: "encoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
encoder_input (InputLayer) [(None, 28, 28, 1)] 0
conv2d_4 (Conv2D) (None, 14, 14, 32) 320
conv2d_5 (Conv2D) (None, 7, 7, 64) 18496
flatten_2 (Flatten) (None, 3136) 0
latent_vector (Dense) (None, 16) 50192
=================================================================
Total params: 69,008
Trainable params: 69,008
Non-trainable params: 0
_________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
decoder_input (InputLayer) [(None, 16)] 0
dense_2 (Dense) (None, 3136) 53312
reshape_2 (Reshape) (None, 7, 7, 64) 0
conv2d_transpose_4 (Conv2DT (None, 14, 14, 64) 36928
ranspose)
conv2d_transpose_5 (Conv2DT (None, 28, 28, 32) 18464
ranspose)
decoder_output (Conv2DTrans (None, 28, 28, 1) 289
pose)
=================================================================
Total params: 108,993
Trainable params: 108,993
Non-trainable params: 0
_________________________________________________________________
Model: "autoencoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
encoder_input (InputLayer) [(None, 28, 28, 1)] 0
encoder (Functional) (None, 16) 69008
decoder (Functional) (None, 28, 28, 1) 108993
=================================================================
Total params: 178,001
Trainable params: 178,001
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0367 - val_loss: 0.0205
Epoch 2/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0193 - val_loss: 0.0180
Epoch 3/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0176 - val_loss: 0.0172
Epoch 4/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0168 - val_loss: 0.0166
Epoch 5/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0163 - val_loss: 0.0163
Epoch 6/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0160 - val_loss: 0.0161
Epoch 7/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0157 - val_loss: 0.0160
Epoch 8/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0154 - val_loss: 0.0160
Epoch 9/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0153 - val_loss: 0.0157
Epoch 10/10
1875/1875 [==============================] - 11s 6ms/step - loss: 0.0151 - val_loss: 0.0156
313/313 [==============================] - 1s 2ms/step
Colorization autoencoder
The code is too long to include here; for details see this link \(\to\) colorization autoencoder
The autoencoder is trained with grayscale photos as input and the corresponding color photos as output; a sketch of the data preparation follows
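A minimal sketch of the data preparation (my own addition, not the book's full code; it assumes CIFAR10 as the color dataset, as in the book's example):
# prepare (grayscale input -> color target) pairs from CIFAR10
from tensorflow.keras.datasets import cifar10
import numpy as np

def rgb2gray(rgb):
    # luma transform: weighted sum of the R, G, B channels
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])

(x_train, _), (x_test, _) = cifar10.load_data()
img_rows, img_cols = x_train.shape[1], x_train.shape[2]

x_train_gray = rgb2gray(x_train).reshape(-1, img_rows, img_cols, 1)
x_train = x_train.astype('float32') / 255
x_train_gray = x_train_gray.astype('float32') / 255

# a colorization autoencoder is then trained with grayscale inputs
# and color targets, e.g. autoencoder.fit(x_train_gray, x_train, ...)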
Summary
- An autoencoder is a neural network that compresses data into a low-dimensional representation in order to perform structural transformations such as denoising and colorization efficiently
Chapter 4: Generative Adversarial Networks (GANs)
- Generator: continually learns how to generate fake data/signals that can fool the discriminator
- Discriminator: trained to distinguish real signals from fake ones
  - valid data is labeled 1.0 and synthesized data is labeled 0.0 (0% probability of being real)
Class skeleton for developing the generator
class Generator:
    def __init__(self):
        self.initVariable = 1

    def lossFunction(self):
        return

    def buildModel(self):
        return

    def trainModel(self, inputX, inputY):
        return
Class skeleton for developing the discriminator
class Discriminator:
    def __init__(self):
        self.initVariable = 1

    def lossFunction(self):
        return

    def buildModel(self):
        return

    def trainModel(self, inputX, inputY):
        return
Loss functions
class Loss:
    def __init__(self):
        self.initVariable = 1

    def lossBaseFunction1(self):
        return

    def lossBaseFunction2(self):
        return

    def lossBaseFunction3(self):
        return
- Loss function used by the generator during adversarial training:
\[ \nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z^{(i)}))\right) \]
- The standard cross-entropy objective applied in GANs (the discriminator's gradient):
\[ \nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \left[ \log D(x^{(i)}) + \log\left(1 - D(G(z^{(i)}))\right) \right] \]
- These are the functions from Goodfellow's paper; a Keras-style sketch follows
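A minimal Keras sketch of how these objectives are trained in practice (my own addition, not the book's actual code; the tiny Dense generator/discriminator are placeholders, and both objectives are expressed through 'binary_crossentropy'):
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

latent_size = 100
data_dim = 28 * 28  # e.g. flattened MNIST

# placeholder one-layer generator and discriminator
generator = Sequential([Dense(data_dim, activation='sigmoid',
                              input_dim=latent_size)])
discriminator = Sequential([Dense(1, activation='sigmoid',
                                  input_dim=data_dim)])
# discriminator: cross-entropy on real (1.0) vs fake (0.0) labels
discriminator.compile(loss='binary_crossentropy', optimizer='adam')

# adversarial model = generator + discriminator; the discriminator is
# frozen here so only theta_g is updated (Keras captures the trainable
# flag at compile time, so the discriminator itself still trains)
discriminator.trainable = False
adversarial = Sequential([generator, discriminator])
adversarial.compile(loss='binary_crossentropy', optimizer='adam')

batch_size = 64
real_images = np.random.rand(batch_size, data_dim)  # stand-in for real data

# discriminator step: real -> 1.0, generated -> 0.0
noise = np.random.uniform(-1.0, 1.0, size=(batch_size, latent_size))
x = np.concatenate([real_images, generator.predict(noise)])
y = np.concatenate([np.ones((batch_size, 1)), np.zeros((batch_size, 1))])
discriminator.train_on_batch(x, y)

# generator step: label generated samples 1.0 and train through the
# frozen discriminator (the non-saturating form of the generator loss)
noise = np.random.uniform(-1.0, 1.0, size=(batch_size, latent_size))
adversarial.train_on_batch(noise, np.ones((batch_size, 1)))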
note: it would probably be better to study the code the professor explained. professor's code
DCGAN
- the first successful implementation of the early GAN, using deep CNNs
Conditional GAN (CGAN)
- similar to DCGAN except for the one-hot vector
- a one-hot vector is used to impose a condition on the outputs of the generator and the discriminator; a small sketch of the conditioning follows
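A small sketch of the conditioning (my own addition): the condition is a one-hot label concatenated with the generator's noise input, and the same one-hot label is fed to the discriminator alongside the (real or fake) image.
import numpy as np
from tensorflow.keras.utils import to_categorical

latent_size, num_classes, batch_size = 100, 10, 16
noise = np.random.uniform(-1.0, 1.0, size=(batch_size, latent_size))
labels = np.random.randint(0, num_classes, batch_size)
one_hot = to_categorical(labels, num_classes)       # condition, (16, 10)
generator_input = np.concatenate([noise, one_hot], axis=1)  # (16, 110)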
Chapter 5: Improved GANs
Wasserstein GAN
- argues that GAN instability is caused by the loss function based on the Jensen-Shannon (JS) distance
- a replacement for the JS distance that is better suited to GAN optimization is needed; a common candidate is sketched below
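The replacement is the Wasserstein (earth mover's) distance; a minimal sketch of the loss as it is commonly written in Keras (my own addition; label convention here: real = +1, fake = -1):
from tensorflow.keras import backend as K

def wasserstein_loss(y_true, y_pred):
    # the critic maximizes D(real) - D(fake); with labels y = +1 (real)
    # and y = -1 (fake) this is minimizing -mean(y * D(x))
    return -K.mean(y_true * y_pred)

# used in place of 'binary_crossentropy', together with weight clipping
# (e.g. to [-0.01, 0.01]) after each critic update to enforce the
# Lipschitz constraint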
ref: https://lilianweng.github.io/posts/2017-08-20-gan/
- The concepts in this part of the book are really hard.. I'd like to understand the related equations, but the book doesn't go into much detail about them.. For now I'll work on understanding the math first and come back to reread the book later~~